A Parallel PageRank Algorithm with Power Iteration Acceleration
Abstract
Building on the basic idea of the PageRank algorithm and the MapReduce distributed programming model, this paper first proposes a parallel PageRank algorithm based on adjacency lists that is suitable for massive data processing. Then, after examining the iterative nature of PageRank, it presents an iteration acceleration model based on vector computing. Using this acceleration model, the paper then proposes a parallel PageRank algorithm with power iteration acceleration on MapReduce. Finally, extensive experiments show that both proposed algorithms are suitable for massive data processing, and that the second significantly reduces the number of iterations and improves the efficiency of the PageRank algorithm.
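To illustrate the adjacency-list formulation mentioned in the abstract, the following is a minimal single-process sketch of one way a PageRank iteration can be split into a map step and a reduce step. It is not the paper's implementation: the function names (`map_phase`, `reduce_phase`, `pagerank`), the damping factor of 0.85, the uniform redistribution of rank from pages without out-links, and the L1 stopping tolerance are all assumptions made for illustration. The paper's acceleration model is not specified in the abstract, so the sketch covers only plain power iteration.

```python
from collections import defaultdict


def map_phase(adjacency, ranks):
    """Map step: each page emits an equal share of its rank to every out-link.

    Pages with no out-links emit their whole rank under the key None
    (an assumed convention for handling dangling pages).
    """
    for page, out_links in adjacency.items():
        if out_links:
            share = ranks[page] / len(out_links)
            for target in out_links:
                yield target, share
        else:
            yield None, ranks[page]


def reduce_phase(pairs, pages, damping):
    """Reduce step: sum contributions per page and apply the damping factor."""
    sums = defaultdict(float)
    dangling = 0.0
    for target, share in pairs:
        if target is None:
            dangling += share
        else:
            sums[target] += share
    n = len(pages)
    # Dangling rank is spread uniformly so the ranks keep summing to 1.
    return {
        page: (1.0 - damping) / n + damping * (sums[page] + dangling / n)
        for page in pages
    }


def pagerank(adjacency, damping=0.85, tol=1e-10, max_iter=200):
    """Repeat map/reduce rounds until the L1 change in ranks drops below tol.

    Assumes every link target also appears as a key in `adjacency`.
    Returns the rank dictionary and the number of rounds performed.
    """
    pages = list(adjacency)
    ranks = {page: 1.0 / len(pages) for page in pages}
    iteration = 0
    for iteration in range(1, max_iter + 1):
        new_ranks = reduce_phase(map_phase(adjacency, ranks), pages, damping)
        delta = sum(abs(new_ranks[p] - ranks[p]) for p in pages)
        ranks = new_ranks
        if delta < tol:
            break
    return ranks, iteration


if __name__ == "__main__":
    # Hypothetical four-page graph; "d" has no out-links.
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": []}
    ranks, rounds = pagerank(graph)
    for page in sorted(ranks, key=ranks.get, reverse=True):
        print(page, round(ranks[page], 4))
    print("rounds:", rounds)
```

In a real MapReduce job each round would be one job whose map output is shuffled by target page; here the generator and dictionary stand in for that shuffle. The returned round count is the quantity an acceleration scheme such as the one the paper describes would aim to reduce.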
Similar Resources
An Inner/Outer Stationary Iteration for Computing PageRank
We present a stationary iterative scheme for PageRank computation. The algorithm is based on a linear system formulation of the problem, uses inner/outer iterations, and amounts to a simple preconditioning technique. It is simple, can be easily implemented and parallelized, and requires minimal storage overhead. Convergence analysis shows that the algorithm is effective for a crude inner tolera...
The PageRank Vector: Properties, Computation, Approximation, and Acceleration
An important problem in Web search is to determine the importance of each page. After introducing the main characteristics of this problem, we will see that, from the mathematical point of view, it could be solved by computing the left principal eigenvector (the PageRank vector) of a matrix related to the structure of the Web by using the power method. We will give expressions of the PageRank v...
PageRank algorithm and Monte Carlo methods in PageRank Computation
PageRank is the algorithm used by the Google search engine for ranking web pages. The PageRank algorithm calculates for each page a relative importance score, which can be interpreted as how often the page is visited by a surfer. The purpose of this work is to provide a mathematical analysis of the PageRank algorithm. We analyze the random surfer model and the linear algebra behind it...
FrogWild! - Fast PageRank Approximations on Graph Engines
We propose FrogWild, a novel algorithm for fast approximation of high PageRank vertices, geared towards reducing network costs of running traditional PageRank algorithms. Our algorithm can be seen as a quantized version of power iteration that performs multiple parallel random walks over a directed graph. One important innovation is that we introduce a modification to the ...
Technical Report TR-2012-018: Chebyshev Acceleration of the GeneRank Algorithm
The ranking of genes plays an important role in biomedical research. The GeneRank method of Morrison et al. [11] ranks genes based on the results of microarray experiments combined with gene expression information, for example from gene annotations. The algorithm is a variant of the well-known PageRank iteration, and can be formulated as the solution of a large, sparse linear system. Here we sh...